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Abstract 

We  study  the  choice  of  the  Lagrange  multipliers  in  the  successive 
quadratic  programming  method  (SQP)  for  equality  constrained  opti¬ 
mization. 

It  is  known  that  the  augmented  Lagrangian  SQP-Newton  method 
depends  on  the  penalty  parameter  only  through  the  multiplier  in  the 
Hessian  matrix  of  the  Lagrangian  function.  This  effectively  reduces 
the  augmented  Lagrangian  SQP-Newton  method  to  the  Lagrangian 
SQP-Newton  method  where  only  the  multiplier  estimate  depends  on 
the  penalty  parameter.  In  this  work,  we  construct  a  multiplier  es¬ 
timate  that  depends  strongly  on  the  penalty  parameter  and  derive 
a  choice  for  the  penalty  parameter  that  attempts  to  make  the  Hes¬ 
sian  matrix,  restricted  to  the  tangent  space  of  the  constraints,  pos¬ 
itive  definite  and  well  conditioned.  We  demonstrate  that  the  SQP- 
Newton  method  with  this  choice  of  Lagrange  multipliers  is  locally  and 
q-quadratically  convergent.  Considerable  numerical  experimentation 
is  included  and  shows  that  our  approach  merits  further  investigation. 

"This  research  represents  a  portion  of  the  first  author  doctoral  thesis  under  the  supervi¬ 
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1  Introduction 

In  this  work  we  study  the  successive  quadratic  programming  approach  (SQP- 
Newton  method)  to  equality  constrained  nonlinear  optimization.  The  SQP- 
Newton  method  requires  an  approximation  of  the  Lagrange  multipliers  and 
the  solution  to  an  equality  constrained  quadratic  programming  problem  at 
each  iteration.  The  objective  function  of  this  quadratic  problem  is  a  quadratic 
approximation  of  the  Lagrangian  function  (Lagrangian  SQP-Newton  method). 
The  solution  of  this  quadratic  problem  may  not  be  unique  or  may  not  exist  if 
the  Hessian  matrix  of  the  Lagrangian  function  is  not  positive  definite  on  the 
tangent  space  of  the  constraints.  Moreover,  even  when  the  solution  of  this 
quadratic  problem  exists,  it  may  not  be  reliable  if  the  Hessian  matrix  is  not 
well  conditioned.  Several  decades  ago  it  became  fashionable,  promoted  by 
the  work  of  Hestenes  in  1969,  to  work  with  the  augmented  Lagrangian  instead 
of  the  Lagrangian.  Many  researchers  considered  the  augmented  Lagrangian 
SQP-Newton  method  instead  of  the  Lagrangian  SQP-Newton  method.  The 
objective  function  of  the  quadratic  problem  in  the  augmented  Lagrangian 
SQP-Newton  method  is  a  quadratic  approximation  of  the  augmented  La¬ 
grangian  function.  Clearly,  in  this  case,  we  not  only  require  an  approxi¬ 
mation  to  the  Lagrange  multipliers  but  a  value  for  the  penalty  parameter. 
However,  the  choice  of  the  penalty  parameter  turned  out  to  be  an  extremely 
delicate  issue  since  the  augmented  Lagrangian  SQP-Newton  method  is  quite 
sensitive  to  this  choice.  Little  success,  if  any,  was  obtained  in  this  direction 
and  researchers  eventually  abandoned  the  use  of  the  augmented  Lagrangian 
and  returned  to  the  standard  Lagrangian  (choice  of  zero  for  the  penalty  pa¬ 
rameter). 

It  is  known  that  the  augmented  Lagrangian  SQP-Newton  method  depends 
on  the  penalty  parameter  only  through  the  multiplier  in  the  Hessian  matrix  of 
the  Lagrangian  function.  This  effectively  reduces  the  augmented  Lagrangian 
SQP-Newton  method  to  the  Lagrangian  SQP-Newton  method  where  only 


2 


the  multiplier  estimate  depends  on  the  penalty  parameter.  The  objective  of 
the  current  work  is  to  derive  a  choice  for  the  penalty  parameter  so  that  the 
Hessian  matrix,  restricted  to  the  tangent  space  of  the  constraints,  is  positive 
definite  and  well  conditioned.  Moreover,  we  desire  that  this  choice  of  the 
penalty  parameter  does  not  destroy  the  local  and  q-quadratic  convergence 
of  the  Lagrangian  SQP-Newton  method.  This  work  has  similarities  to  Tapia 
[9]  where  he  used  the  penalty  parameter  to  obtain  effective  BFGS  and  DFP 
secant  updates  for  equality  constrained  optimization  in  the  SQP  framework. 

This  paper  is  organized  as  follows.  In  Section  2  we  give  background  ma¬ 
terial  on  the  augmented  Lagrangian  SQP  method  for  equality  constrained 
optimization  and  motivate  our  philosophical  approach  for  picking  the  penalty 
parameter.  We  begin  Section  3  by  posing  an  ideal  constrained  optimization 
problem  whose  solution  gives  the  desired  penalty  parameter.  The  constraint 
consists  of  an  inequality  involving  the  Byrd  and  Nocedal  measure  function. 
Next,  with  an  eye  towards  computation,  we  observe  that  the  Byrd-Nocedal 
function  serves  effectively  as  a  barrier  function  and  replace  our  ideal  problem 
with  an  equivalent  logarithmic  barrier  function  problem.  In  Section  4  this 
barrier  function  problem  is  studied  closely  for  the  special  case  of  equality 
constrained  optimization  with  only  one  constraint.  From  this  analysis  we 
conclude  that  solving  the  barrier  function  problem  would  not  be  numerically 
efficient.  Hence,  we  replace  the  barrier  function  problem  with  a  problem 
that  preserves  some  of  the  flavor,  but  whose  solution  can  be  readily  ob¬ 
tained.  The  procedure  for  obtaining  the  penalty  parameter  for  the  special 
case  of  one  constraint  is  extended  to  many  constraints  in  Section  5.  Our  idea 
is  to  use,  as  the  basis  of  our  extension,  the  one- constraint  approach  developed 
in  Section  4.  We  do  this  in  two  distinct  ways;  the  first  we  call  the  parallel 
approach  and  the  second  we  call  the  sequential  approach.  In  Section  6  we 
establish  the  local  and  q-quadratic  convergence  of  the  SQP-Newton  method 
with  this  new  choice  of  Lagrange  multipliers.  Finally,  in  Section  7  we  present 
numerical  results  using  our  new  choice  of  Lagrange  multipliers  in  the  SQP- 
Newton  method.  The  numerical  results  obtained  by  using  this  new  approach 
for  computing  the  Lagrange  multipliers  are  encouraging.  It  was  possible 
to  achieve  convergence  in  many  examples,  where  the  SQP-Newton  method 
with  well-known  choices  for  the  Lagrange  multipliers  (the  least-squares  mul¬ 
tiplier  estimate  [4],  the  Miele-Cragg-Levy  multiplier  [6],  and  the  multiplier 
associated  with  the  solution  of  the  quadratic  program  (see  Tapia  [10]  ))  did 
not  produce  iteration  sequences  which  converged.  A  significant  part  of  this 
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work  is  the  study  of  the  effectiveness  and  the  robustness  of  this  new  choice 
of  Lagrange  multiplier  estimate  in  the  SQP-Newton  framework.  Hence,  in 
our  numerical  comparisons  we  did  not  embed  the  SQP-Newton  method  in 
a  globalization  strategy.  Our  reason  for  not  doing  this  is  that  we  feel  that 
good  global  behavior  of  the  local  method  speaks  strongly  to  the  effectiveness 
of  our  multiplier  choice. 


2  Preliminaries 

In  this  work,  we  axe  concerned  with  the  nonlinear  equality  constrained  opti¬ 
mization  problem: 


minimize  /(x) 

subject  to  h(x)  =  0  ^ 

where  f  :  IR"  — »  IR  and  h  :  IRn  — >  IRm  are  assumed  to  be  smooth  nonlinear 
functions  and  m  <  n.  The  Lagrangian  function  associated  with  problem  (1) 
is 

^(x,  A)  =  /(x)  +  h(x)T  A;  and  (2) 

the  augmented  Lagrangian  function  associated  with  problem  (1)  is 

L(x,  A,  c)  =  /(x)  -(-  h(x)T A  +  ^ h(x)T h(x ).  (3) 

In  the  above  A  =  (Ax,  A2, ...,  Am)T  is  called  the  Lagrange  multiplier  and  the 
parameter  c  €  IR,  c  >  0  is  called  the  penalty  parameter. 

Throughout  this  work  we  will  assume  that  problem  (1)  has  a  solution 
x*  with  associated  Lagrange  multiplier  A*  and  we  also  assume  the  standard 
assumptions  for  the  analysis  of  Newton’s  method: 

(Al)  /,  hi  €  C2(D),  where  D  is  an  open  convex  neighborhood  of  the  local 
solution  x*  of  problem  (1),  and  V2/  and  V2/q  are  Lipschitz  continuous 
at  x». 

(A2)  Vh(x»)  has  full  rank. 

(A3)  2TV2£(x*,  A*)z  >  0  for  all  z  ^  0  satisfying  Wh(xm)Tz  =  0. 
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Consider  the  augmented  Lagrangian  SQP-Newton  method.  For  a  given 
iterate  ( Xk ,  A*)  and  penalty  parameter  Ck  we  let 

Xk+1  =  Xk  +  AxN 

Afc+i  =  Afc  +  AAjv  (4) 

where  Axn  is  the  solution,  and  AA#  is  the  multiplier  associated  with  the 
solution,  of  the  quadratic  program 

minimize  q( Ax)  =  VxL(xk,  A k,  ck)T Ax  +  \AxTV2xL{xk,  A k,  Ck) Ax  ,  . 
subject  to  Vh(xk)T Ax  -f  h(xk)  =  0.  '  ' 

The  second-order  sufficiency  conditions  for  problem  (5)  are  the  associated 
first  order  necessary  conditions  and 

sTVxL(xki  A*,  Ck)s  >  0  for  all  s^O  such  that  (6) 

Vh(xk)Ts  =  0.  (7) 

Clearly,  if  VlL(xk,  Xk,  c*)  does  not  satisfy  these  sufficiency  conditions,  then 
problem  (5)  may  not  have  a  solution;  however,  even  if  A*,  Ck)  satisfies 

these  conditions,  the  solution  may  not  be  reliable. 

Consider  the  augmented  Lagrangian  SQP-Newton  method.  The  first- 
order  necessary  conditions  associated  with  the  quadratic  program  (5)  are 

(  ^ T^i^ki  Afc,  Ck)  'Vh(xk)  \  (  A xpf  \  ^ xL(xki  Afc,  c^)  \  /q\ 

V  Vh(xk)T  0  )  ^  &\N  )  ~  \  -h(xk)  )  ■  (8) 

It  is  not  difficult  to  see  that  (8)  reduces  to 

(  Vl£(xk,Xk  +  Ckh(xk ))  V^(xfc)  ^  /  AxN  \  _  (  -Vx£(xk,\k)  \ 

1,  VM**)T  0  )  \  AA„  )  ~  {  -h(xk)  )  W 

From  equation  (8)  and  the  equivalence  between  (8)  and  (9)  we  observe  that: 
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•  The  augmented  Lagrangian  SQP  iterate  Xk+ 1  depends  on  the  penalty 
constant  c*  only  through  the  multiplier  estimate  in  the  Hessian  matrix 

V2/(xk,\k  +  ckh(xk)). 

Hence,  we  consider  using,  as  a  Lagrange  multiplier  estimate  in  the  Hessian 
matrix  of  the  Lagrangian  function,  a  quantity  of  the  form 

\k  =  U(xk,\k)  +  Ckh(xk)  (10) 

where  Ck  €  lRm,  Ck  =  diag(ck),  and  U(xk,\k )  is  a  Lagrange  multiplier 
formula  which  does  not  depend  on  c*.  Ideally  we  propose  to  choose  the 
penalty  parameter  in  (10)  so  that: 

(Cl)  the  Hessian  matrix  of  the  Lagrangian  functional  V^£(x*,  A*)  is  positive 
definite  on  the  subspace  {s  6  IRn  :  Wh(xk)Ts  =  0}. 

(C2)  the  Hessian  matrix  V^(xjt,  A*),  restricted  to  the  tangent  space  of  the 
constraints,  is  well  conditioned. 

(C3)  the  local  convergence  properties  of  the  Lagrangian  SQP-Newton  method 
are  maintained. 

We  hope  that  as  a  by-product  we  will  obtain  improved  global  behavior 
of  the  augmented  Lagrangian  SQP-Newton  method.  It  is  worth  mentioning 
that  none  of  the  Lagrange  multiplier  formulas  in  the  literature  guarantee  that 
conditions  (Cl),  (C2)  and  (C3)  are  satisfied. 

3  A  Robust  Choice  of  The  Lagrange  Multi¬ 
pliers 

Consider  ( Xk ,  A*)  the  kth  iterate  of  the  Lagrangian  SQP-Newton  method  for 
problem  (1).  Let  U(x,  A)  be  a  Lagrange  multiplier  formula,  i.e., 
A*  =  U(xm,  A*)  when  (x»,A»)  is  a  stationary  point  of  problem  (1).  Also, 
let  B(xk )  be  a  matrix  whose  columns  form  an  orthonormal  basis  for  the  null 
space  of  Vh(xk)T .  For  c  €  IRm  and  p  G  IR  we  let 

H(c )  :=  B(xk)TV2Axk,U(xk,  h)  +  Ch(xk))B(xk)  (11) 
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and 


(12) 


H(c,p)  :=  H(c)  +  pl 

As  before,  C  denotes  the  diagonal  matrix  diag(c). 

The  task  at  hand  is  to  determine  ck  €  IRm  so  that  the  objectives  (Cl)- 
(C3)  of  Section  2  hold  with  A*  given  by  (10).  Hence,  we  look  for  ck  such  that 
H(ck )  is  positive  definite  and  well  conditioned.  However,  in  general  we  will 
not  be  able  to  attain  this  goal,  so  we  incorporate  a  standard  regularization 
procedure  and  ask  for  ck  and  pk  such  that  H(ck ,  Pk)  is  positive  definite  and 
well  conditioned.  We  then  calculate  our  correction  (Aa:,AA)  to  (xk,\k)  as 
the  solution  of  the  linear  system 

Dk(ck,Pk)  'Vh(xk)  \  f  Ax  \  (  -VJ(xk,U(xk,  A*)) 

Vh(xk)T  0  )  \  AA  )  ~  {  -h(xk) 

with  Dk(ck,  pk)  =  Vl£(xk,  U(xk,Xk)  +  Ckh(xk))  +  pkI. 


The  above  considerations  let  us  to  an  ideal  optimization  problem  of  the 
form 

minimize  f(c,p)  =  \\((Ch(xk))T,p)\\l 

subject  to  H(c,p)£S  ' 

where  S  is  an  appropriate  set  contained  in  the  set  of  symmetric  and  positive 
definite  matrices  (SPD).  We  now  make  an  observation  that  will  facilitate 
our  presentation.  Clearly,  if  a  particular  component  of  the  constraint  vector 
h(xk)  is  zero,  then  the  corresponding  component  of  the  penalty  parameter 
ck  does  not  enter  into  the  multiplier  estimate  (10);  and  hence  does  not  enter 
into  Dk  in  (13)  or  problem  (14).  For  this  reason  we  will  always  choose  this 
component  of  ck  to  be  zero.  Therefore,  problem  (14)  will  be  only  to  select 
components  of  the  penalty  parameter  corresponding  to  nonzero  components 
of  h(xk).  Hence,  without  any  loss  of  generalization  in  the  analysis  of  problems 
of  the  form  (14)  we  will  assume  that  the  vector  h(xk)  has  no  zero  components. 

In  problem  (14)  we  expect  to  define  the  set  S  in  terms  of  the  properties 
of  the  eigenvalues  of  H(c,  p).  Hence  it  is  most  satisfying  that  the  eigenvalues 
of  H(c,p)  are  invariant  with  respect  to  the  choice  of  the  matrix  B(xk)  used 
in  defining  the  regularized  reduced  Hessian  H(c,  p).  This  fact  is  the  topic  of 
the  following  theorem. 
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Theorem  3.1  Let  A  £  IR/"X”)  be  a  symmetric  matrix.  Let  Bi  and 
B2  £  IR/nxfc)  with  n  >  k  be  such  that  each  has  orthonormal  columns  and 
these  two  sets  of  columns  each  form  a  basis  for  the  subspace  V  C  IRn.  Then 
BfABi  and  Bf  AB2  have  the  same  eigenvalues. 

Proof.  It  is  clear  that  Bf  AB2  and  Bf  ABi  are  symmetric  matrices. 
Thus,  the  matrices  Bf  AB2  and  Bf  ABi  have  k  real  eigenvalues  and  the 
corresponding  eigenvectors  form  an  orthonormal  set.  Suppose  A,-  £  IR  is  any 
eigenvalue  of  the  matrix  Bf  AB2  with  corresponding  eigenvector  £  mfc. 
Then, 

B^AB2Xi  =  \{Xi . 

It  is  straightforward  to  see  that  there  exists  an  orthogonal  matrix  Q  £ 
such  that 

B2  =  BiQ. 

Therefore, 

QQt Bl  ABxQxi  =  A  iQxi. 

Since  QQT  =  Ikxk  we  have  that  Bf  ABi(Qxi)  =  A  iQxi  which  implies,  A,-  is 
an  eigenvalue  of  the  matrix  Bj AB\  with  corresponding  eigenvector  Qx{  and 
establishes  the  theorem.  □ 

Recently,  Byrd  and  Nocedal  [1]  introduced  a  measure  function  which  has 
the  flavor  of  the  condition  of  a  matrix  A  €  SPD.  This  function  was  also 
studied  by  Fletcher  [3].  It  is  defined  as: 

=  —  (trace(A)  —  ln(det(v4))),  for  A  £  SPD.  (15) 
n 

The  Byrd-Nocedal  measure  function  (15)  satisfies  the  following  properties: 

(PI)  ^bn(A)  is  a  continuous  and  strictly  convex  function  on  the  set  SPD 
(see  [1]  and  [3]). 

(P2)  'F bn{A )  is  globally  and  uniquely  minimized  by  A  =  I  over  the  set  SPD 
(see  [3]). 

(P3)  ^bn(A)  >  1,  for  any  A  £  SPD  (see  [1]). 

(P4)  \Pb;v(-A)  >  ln(Cond(i4)),  for  any  A  £  SPD 

where  Cond(/t)  denotes  the  2-norm  condition  number  of  the  matrix  A. 
(see  [1]). 
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In  some  sense,  ^btv(A)  can  be  considered  to  be  a  measure  of  the  closeness 
between  A  and  the  identity  matrix. 

Our  ideal  problem,  whose  solution  gives  ( Ck,pk ),  to  be  used  in  Dk(ck,pk ) 
in  the  linear  system  (13)  is 

minimize  f(c,p)  =  ||((C7i(xfc)):r,  p)\\l 
subject  to  H(c,p)€  SPD  (16) 

H>BN(H(c,p))<M 

for  some  choice  of  M  >  1. 

Theorem  3.2  Problem  (16)  has  a  unique  solution. 

Proof.  Clearly  f  is  continuous.  Moreover,  since  we  are  assuming  that  no 
components  of  h(xk)  are  zero,  /  is  also  uniformly  convex.  Furthermore,  the 
constraint  set  is  nonempty,  closed  and  convex.  This  proves  the  theorem. 

□ 

The  remainder  of  this  work  is  concerned  with  finding  a  numerical  ap¬ 
proach  for  solving  problem  (16).  Towards  this  end,  we  replace  problem  (16) 
with  the  barrier  penalty  function  problem 

minimize  /„(c,  p)  =  ||((C'fc(x*))T ,  p)\\\  +  bn(H(c,  p))  ,  . 

subject  to  H(c,  p)  €  SPD  '  ' 

for  some  choice  of  p  >  0. 

Theorem  3.3 

(i)  Problem  (17)  has  a  unique  solution. 

(ii)  If  (c*(/x),  p*(p))  solves  problem  (17),  then  (c*(p),  p*{p))  solves  problem 

(16)  with 

M  =  VBv(V((c'(ft),p‘(M)))- 

Proof.  In  order  to  prove  (i),  consider  any  (c, pi)  such  that  H(c,p )  6 
Let  S  =  {(c,p)  €  IRm+1  :  H{c,p)  €  SPD  and  f„{c,p)  <  M&,p)}. 
consider  the  optimization  problem: 

minimize  /M(c, p)  =  ||((C/i(xfc))T, p)\\\  +  p^Bn(H(c, p)) 
subject  to  ( c,p )  G  S 


SPD. 

Now, 

(18) 
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By  the  continuity,  convexity  of  and  the  fact  that  the  set  S  is  closed 

A  A 

we  have  that  S  is  closed  and  convex.  On  the  other  hand,  S  is  also  bounded 
since  has  the  infinity  property  on  SPD,  i.e.,  /  goes  to  infinity  when  the 
norm  of  of  the  vector  (c,  p)  goes  to  infinity,  for  all  matrices  H(c,p)  in  SPD. 
Moreover,  /M  is  a  strictly  convex  function  on  S.  Thus  the  optimization 

A 

problem  (18)  has  a  unique  solution.  Finally,  since  S  is  contained  in  SPD 
and  fn  is  strictly  convex  on  SPD  we  have  that  the  optimization  problem 
(17)  has  also  a  unique  solution.  The  proof  of  part  (ii)  is  straightforward. 
□ 

4  The  One  Constraint  Case  (m  =  1) 

In  this  section,  we  consider  problem  (1)  with  only  one  constraint  since  it  is  of 
theoretical  importance  for  the  case  of  more  constraints.  For  this  particular 
case,  we  construct  a  constrained  optimization  problem,  that  will  be  solved 
at  each  iteration  of  the  SQP-Newton  method,  to  obtain  an  approximation  to 
the  solution  of  the  barrier  penalty  problem  (17),  with  pk  =  0.  The  reason 
for  considering  pk  =  0  is  that  in  this  way  the  structure  of  the  SQP-Newton 
method  is  preserved  (see  systems  (8)  and  (13)).  Moreover,  we  always  obtain 
better  results  by  first  setting  pk  =  0  and  attempting  to  find  ck  which  satisfies 
our  criterion. 

Suppose  we  are  at  the  kth  iteration  of  the  SQP-Newton  method.  Consider 
the  reduced  Hessian  matrix  H(c)  for  the  one  constraint  case,  i.e., 

H(c)  =  B(xk)TV2xt(xk,U{xk)  +  ch(xk))B(xk) 

=  B(xk)T(W2f(xk)  +  U(xk)V2h(xk))B(xk) 

ch(xk)B(xk)T'V2h(xk)B(xk).  (19) 

We  have  that  all  the  elements  in  (19)  are  known  except  the  value  of  the 
penalty  parameter  c;  so  for  the  sake  of  simplicity,  we  can  write  (19)  as 

H(c)  =  A  +  chkD 

where, 

hk  =  ft(ajfc), 

A  =  B(xkf(V2f(x k)  +  U(xk)V2h(xk))B(xk)  and, 

D  =  B{Xk)TV2h{xk)B{xk). 
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First,  we  characterize  the  set  of  minimizers  of  the  Byrd  and  Nocedal  measure 
function  (15)  over  the  set  SPD. 

Theorem  4.1  Consider  the  optimization  problem 


minimize  ^bn{A  +  chkD) 
H(c)  6  SPD 

If  hk  ^  0,  then  the  solution  c*  of  (20)  satisfies 

trace(H{ck)~l  D  —  D)  =  0. 


(20) 


Proof.  Let  us  differentiate  the  function  'fr bn(H(c))  with  respect  to  the 
penalty  parameter  c 

Yc  (*bn{H(c)))  =  |/ifctrace(D)  -  ^  (ln(det(LT(c))))| .  (21) 

On  the  other  hand,  by  Lemma  1.4  of  Fletcher  [3]  we  have  that 

(9BN(H(c)))  =  {trac e{H{c)~lD  -  D )}  .  (22) 

If  Cfc  is  a  solution  of  problem  (20)  and  hk  ^  0  we  have  that  c*  satisfies  the 
first-order  necessary  conditions,  that  is: 

trace(f/(c*)-1  D  -  D)  =  0.  □  (23) 

From  the  above  result  we  observe  that  if  there  exists  cjt  such  that  H(ck)  =  /, 
then  equation  (23)  is  satisfied  and  this  Ck  is  the  unique  minimizer  of  problem 
(20).  However,  this  is  not  always  possible,  nor  desirable,  and  it  is  not  trivial 
to  obtain  an  expression  for  Ck  from  the  first-order  necessary  condition  (23). 
Therefore,  we  consider  substituting  H(ck )  for  H{ck)~l  in  (23)  to  obtain 

trac  e(H(ck)D  —  D)  =  0.  (24) 

Now,  consider  the  following  equation 

Ckhk  +  ptra,ce(H(ck)D  —  D)  =  0.  (25) 
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The  first  term  on  the  left-hand  side  of  (25)  corresponds  to  the  first-order 
necessary  conditions  for  the  minimization  of  HI  and  we  are  assuming 
hk  7^  0.  Our  next  result  shows  that  there  exits  a  relationship  between  equa¬ 
tion  (25)  and  the  following  optimization  problem: 

minimize  <j>(c)  =  \\chk\\l  +  p\\H(c)  -  I\\2F 
c  €  IR  ^ 

Theorem  4.2  A  necessary  and  sufficient  condition  for  ck  to  be  a  minimizer 
of  problem  (26)  is  that  ck  satisfy  equation  (25). 

Proof.  Observe  that  <f>  ( ck )  =  0  is  equivalent  to 

2  ckh\  +  2phktr&ce(H{ck)D  —  D)  =  0. 

If  hk  ^  0,  and  trace(Z)2)  0  then  this  necessary  condition  becomes 

cffik  +  /itrace(i?(cfe)D  -  D)  =  0.  (27) 

On  the  other  hand,  we  know  that  the  function  (f>  is  uniformly  convex  on  IR. 
Hence,  equation  (27)  is  a  necessary  and  sufficient  condition  for  ck  to  be  a 
minimizer  of  problem  (26).  □ 

Theorem  4.2,  Theorem  4.1,  and  the  fact  that  the  minimizer  of  the  Byrd 
and  Nocedal  measure  function  (15)  over  the  set  SPD  is  the  identity  matrix, 
motivate  us  to  replace  the  barrier  penalty  function  /M  given  in  problem  (17) 
with  the  following  function: 

MA  =  Mi  +  n\\H(c)  -  /Ilf-,  (/<  >  o).  (28) 

It  is  clear,  that  for  the  minimizer  c*  of  <f>  we  can  not  guarantee  that  the 
matrix  H(ck )  is  positive  definite.  For  this  reason,  we  need  to  restrict  the  min¬ 
imization  of  (j>  to  a  set  contained  in  SPD.  Tarazaga  in  [11]  and  [12]  presented 
a  sufficiency  condition  for  a  matrix  A  6  IR"Xn  to  be  positive  definite.  This 
condition  does  not  require  us  to  know  the  eigenvalues  of  the  matrix  A  and 
it  is  an  easy  condition  to  evaluate.  We  state  this  condition  as  the  following 
theorem  (see  [12]). 

Theorem  4.3  (Tarazaga) 

Let  A  €  IRnXn.  If 

T(A)  =  trace(A)  -  (n  -  1)*||.4||f  >  0,  (29) 

then  A  is  positive  definite. 
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In  our  current  application  H{c)  is  an  (n  —  1)  x  (n  —  1)  matrix.  Observe 
that  if  /  is  the  (n  —  1)  x  (n  —  1)  identity  matrix,  then  for  r)  >  0 

T(t]I)  =  V  ((»  - 1)  -  \/(n  “  !)\An  “  2))  ■  (30) 

Let 

n  =  {c  6  IR  :  T{H(c))  >  (n  -  1)  -  y/(n  -  l)y/(n  -  2)}  .  (31) 

The  constrainted  optimization  problem  that  we  propose  to  solve  at  each 
iteration  of  the  SQP-Newton  method  to  obtain  the  penalty  parameter  ck  is 

minimize  <f>(c)  =  ||c/ifc|||  +  p\\H(c)  -  I\\2F  ^ 

subject  to  c  £  ft.  ^ 

Clearly,  by  the  Tarazaga  condition  (29)  any  matrix  in  17,  the  constraint  set  of 
problem  (32),  is  symmetric  and  positive  definite;  however  17  may  be  empty. 
An  obvious  way  to  enlarge  17  is  to  use  T(rjl)  for  rj  £  (0, 1)  instead  of  T(I)  in 
the  definition  of  17  given  by  (31).  In  the  current  study  we  did  not  investigate 
this  option;  it  probably  merits  further  investigation. 

Theorem  4.4  If  ft  /  0,  then  the  optimization  problem  (32)  has  a  unique 
solution. 

Proof.  The  proof  follows  from  the  fact  that  17  is  closed  and  convex  and 
(f>(c)  is  continuous  and  uniformly  convex.  □ 


4.1  Algorithm  for  the  Penalty  Parameter 

Recall  that  17  is  given  by  (31).  The  algorithm  we  propose  for  obtaining  (c,  p) 
at  the  kth  iteration  of  the  SQP-Newton  method,  is  as  follows 

Algorithm  4.5 

Given  p  >  0  do  the  following: 

If  H{ 0)  €  17 

Set  (ck,pk)  =  (0,0) 

Else 

Let  c*k  be  the  solution  of  the  constrained  problem  (32),  if  it  exists. 
Otherwise,  let  c*k  be  the  solution  of  the  unconstrained  problem  (26). 
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If  H(c*k)  €  n 

Take  {ck,pk)  =  (4,0) 

Else 

Take  ( ck,pk )  =  (0,  pk)  where  pk  is  computed  such  that 
H(0)  +  pkI  €  SPD 

End 

End 

Notice  that  in  Algorithm  4.5  we  compute  pk  only  if  H( 0)  ^  f l  and 
H(ck )  ^  Q.  Moreover,  in  this  case  we  take  c*  =  0  and  add  pkI  to  the 
reduced  Hessian  matrix  H( 0).  This  reasoning  comes  from  the  fact  that  ck 
obtained  in  the  first  steps  of  the  algorithm  is  not  a  satisfactory  choice  for  the 
penalty  parameter.  Thus,  in  this  case  it  makes  better  sense  to  correct  the 
matrix  H( 0)  instead  of  the  matrix  H(ck).  In  order  to  compute  pk  we  use  the 
modified  Cholesky  factorization  as  presented  in  Dennis  and  Schnabel  [2]. 


5  An  Explicit  Expression  for  the  Penalty  Pa¬ 
rameter 


In  this  section  we  use  Lagrange  multiplier  theory  to  find  an  explicit  solution 
of  problem  (32)  for  the  case  m  =  1. 

The  system  of  first-order  necessary  conditions  for  problem  (32)  is 


Vc/(cfc,  7) 


T(H(ck)) 
7 (T(H(ck))  -  T(I )) 
7 


> 

> 


2cfc/i£  +  p,(2hktT&ceAD) 
+2ckh2ktv&ce(D2)  —  2hktia,ce(D)) 


-7 


{ 


hktia,ce(D )  — 


(n-2)W(ck)  ) 
2yJ{n  -2  )R(ct)  j 


0 

T(I) 

0 

0 


(33) 

(34) 

(35) 

(36) 


where, 

W(c )  =  2/ifctrace(AD)  -(-  2ch2tia,ce(D 2)  and, 
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R(c)  =  \Jtva,ce(A 2)  +  2chktrace(AD)  +  (c)2/i|trace(Z)2). 

The  first-order  necessary  conditions  will  be  satisfied  if  one  of  the  following 
cases  occurs 
Case  1: 

If  -y  =  0,  then  (35)  and  (36)  are  satisfied  and  thus, 

/x{trace(Z))  —  trac  e(AD)}  ,  . 

Ck  hk  +  iihktra,ce(D2) 

is  a  solution  of  the  system  of  first-order  necessary  conditions  ((33),  (34),  (35) 
and  (36))  if  T(H{ck))  >  T(I) 

Case  2: 

If  7  >  0,  then  by  the  complementarity  equation  (35)  we  have  that  any  c* 
satisfying 


T(H(ck))  —  T(I)  =  0  and  such  that,  (38) 

7  >  0  (39) 

is  a  solution  of  the  system  of  first  order  necessary  conditions  for  problem  (32) 
where, 

{2 Ckh\  +  /i{2fefctrace(AJD)  +  2cfcfe|trace(T>2)  —  2hktra,ce(D)}}  2 R(ck) 

2 R(ck)  hktrax:e(D)  —  yj(n  —  2)  W(ck ) 

Our  next  result  characterize  the  solutions  of  the  equation  (38). 


Lemma  5.1  Let 

T(H(c))  =  (trace(A)  +  chktrace(D)  -  T{I)f  -  (n  -  2 )R(c)2. 

Then, 

{c  €  IR  :  T(H(c))  -  T(I)  =  0}  C  {c  e  IR  :  T(H(c))  =  0}. 


Proof. 


Let  us  write 


T(H(c))  -  T(I )  = 


(T(H(c))  -  T(I))(T(H(c))  +  T(I)) 
T(H(c))  +  T(I ) 


(40) 
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If  c  satisfies  T(H(c))  —  T(I )  =  0  then, 

T(H(c))  =  (T(H(c))  -  T(I))(T(H(c))  +  T(I))  =  0  and, 
T(H(c))  +  T(I )  ±  0.  □  (41) 

From  the  above  result  we  have  that  any  zero  of  the  polynomial  of  degree 
two,  T(H(c )),  that  satisfies  equation  (41),  is  a  zero  of  equation  (38). 

6  The  m  >  1  Constraint  Case 

In  this  section  we  extend  Algorithm  4.5  to  the  case  where  the  number  of 
constraints  m  is  greater  than  one.  We  present  two  distinct  algorithms  for  de¬ 
termining  a  penalty  vector  ck  in  the  Lagrange  multiplier  formula  (10)  keeping 
objectives  (Cl),  (C2)  and  (C3)  in  mind. 

In  order  to  introduce  the  first  algorithm  we  write  the  expression  of  the 
reduced  Hessian  matrix  of  the  Lagrangian  function  H (c)  when  the  multiplier 
is  given  by  (10)  as: 

m  m 

H(c)=A  +  '£,V'kDi+Y,cihiD‘,  (42) 

1=1  1=1 

where, 

hi  =  hi(xk) 

ut  =  KM 

A  =  B(xk)TVlf(xk)B(xk) 

&  =  B(xk)TVlhi(xk)B(xk). 

On  the  other  hand,  we  can  write  (42)  in  the  following  way, 

H(c)  =  ^{tfi(ci)  +  H2(c2)  +  ....  +  tfm(cm)}  (43) 

where, 

Ai  =  A  4-  mUkD'  and  Hi{c )  =  A,  +  mchiD'  for  *  =  1, ...,  m. 
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Let  us  define  the  set  f l  for  the  case  m  >  1  as 


fi  =  {c6  IRm  :  T(H(c))  >  T(J)}, 


and  we  also  define  for  i  =  1,  ...,m 


(44) 


Sli  =  {c  6  IRm  :  T(Hi(c))  >  T(I)},  (45) 


where  T  was  defined  in  Section  4  by  (29).  Now,  we  can  state  an  extension  of 
Algorithm  4.5,  for  computing  the  vector  (ck,pk),  when  m  >  1. 


Algorithm  6.1  (Parallel) 
Given  p  >  0  do  the  following: 

If  H(0)  e  n 

Set  ( ck,pk )  =  (0,0) 

Else 


For  i  =  1,  ...m  solve  the  following  constrained  problems  for  Cj 
(Const)i  =  min^,(c,)  =  ||c,A,|||  +  p\\ Hfa)  -  I\\2F 

Ci  6  fi* 

If  Hi(c i)  e  n, 

Set  c*  =  Ci 

Else 

Let  c*  be  the  solution  of  unconstrained  problem 
(Unconst )i  =  min<^t(c,)  =  ||ctAt-|||  +  p\\ Lf,(c,)  -  I\\2F 

End 


End 

Set 


if  h(c*)  €  n 

Set  (cfc,  pk)  =  (c*,  0) 


Else 


Set  (c*;,  pk)  =  (0,  pk)  where  pk  is  computed  such  that 
H( 0)  +  pkI  €  SPD 

End 


End 


Notice  that  Algorithm  6.1  is  actually  a  parallel  version  of  Algorithm  4.5, 
given  in  Section  4,  since  it  is  possible  to  solve  each  problem  (Const),-  for 
i  =  1  ,..m  independently. 
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In  order  to  state  another  version  of  Algorithm  4.5  we  write  the  reduced 
Hessian  matrix  of  the  Lagrangian  function,  at  iteration  k ,  of  the  SQP-Newton 
method  as: 


H(c) 


11  i  m  m 

Ao  +  (-A  +  -A  +  ...  +  -A)  +  Y:  UiD{  +  £  CihiD* 

mm  m  ££  ,=1 

1  m 

Am-2  +  -A  +  U?Dm+  £  ^D> 

m  i=m-l 

Am- 1  +  cmhmDm 


(46) 


where, 

Aq  =  — A  +  UlD 1 
m 

Ai  =  —A  +  Ai-i  +  Ui+1Di+1  +  c'khiD'  i  =  l,...,m-l 
m 


Let  us  denote,  for  i  =  1,  ...,m 

Hi(c)  =  A,_i  +  chiD'  and  (47) 

a  =  {cemm  :T(Hi{c))>-T{I)},  (48) 

m 

where  T  is  given  by  (29). 

We  will  use  the  expression  (46)  for  H(ck)  and  the  notation  (44),  (47)  and 
(48)  to  state  another  extension  of  Algorithm  4.5  when  m  >  1. 


Algorithm  6.2  (Sequential) 

For  a  given  value  of  the  constant  /z  >  0  do  the  following: 

If  77(0)  G  Cl 

Set  (ck,  pk)  =  (0, 0) 

Else 

For  i  =  1,  ...m  solve  the  constrained  problems  for  c, 

(Const);  =  min^,(c,)  =  ||c,-A,-|||  +  /x||77,(ci)  -  ± I\\p 

G 
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If  Hi(ci)  €  ft, 

Set  c*  =  c,- 

Else 

Let  c*  be  the  solution  of  unconstrained  problem 
(Unconst)j  =  min<£,(c;)  =  \\cihi\\l  +  p\\Hi(ci)  -  ±I\\f 

End 

Set  Ai  =  ~^A  +  Ai-i  -j-  c*hiD' 

End 

Set  c*  =  (cj,  c%, cj^) 

If  f/(c*)  €  ft 

Set  (cfc,  /9fe)  =  (c*,  0) 

Else 

Set  ( Ck,pk )  —  (0 ,pk)  where  pk  is  computed  such  that 

H( 0)  +  PkI  €  SPD 

End 

End 

Algorithm  6.2  is  a  sequential  version  of  Algorithm  4.5,  since  the  solution 
of  problem  (Const),+1  depends  on  the  solution  of  problem  (Const),  for  i  = 
1, ..,  m  —  1. 

7  Local  and  q- Quadratic  Convergence 

In  this  section  we  study  the  local  convergence  properties  of  the  SQP-Newton 
method  with  the  Lagrange  multiplier  estimate  given  by  formula  (10).  We 
present  our  convergence  analysis  in  terms  of  a  generic  choice  for  the  penalty 
parameter  c  in  formula  (10).  Towards  this  end  let  us  begin  with  the  following 
definition. 

Definition  Let  x*  be  a  stationary  point  of  problem  (1)  and  consider  the 
penalty  choice  function  c  :  IRn  — >•  IRm .  We  say  that  c  is  locally  bounded  at  x* 
if  there  exists  N(x+),  a  neighborhood  of  x*,  such  that  c(N(xm))  is  a  bounded 
subset  ofJRm 

In  addition  to  the  standard  Newton’s  method  assumptions  (A1)-(A3),  our 
convergence  theory  will  require  the  boundedness  assumption: 
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(A4)  The  penalty  choice  function  c  is  locally  bounded  at  x*. 

We  demonstrate  local  convergence  and  quadratic  convergence  for  the 
SQP-Newton  method  using  the  Lagrange  multiplier  choice  (10)  under  the 
standard  assumptions  (A1)-(A3)  and  the  boundedness  assumption  (A4). 

First,  we  present  a  useful  general  perturbation  lemma  that  resembles 
Lemma  11.2.2.  in  Ortega  and  Rheinboldt  [7].  However,  our  result  is  more 
general  in  the  sense  that  we  consider  perturbations  to  both  the  Jacobian 
matrix  and  the  right-hand  side  in  the  Newton  linear  system.  To  begin  with 
consider  the  nonlinear  system 


F(x)  =  0,  (49) 

where  F  :  IRn  — >  IRn.  We  will  use  the  notation  J(x )  =  F  (x).  Now,  consider 
Newton’s  method  applied  to  (49) 

x+  =  x  —  J(x)-1F(x),  (50) 

and  a  perturbation  of  this  method,  say 

x+  =  x  -  (J(x)  +  J(x))-1(F(x)  +  F(x)),  (51) 

where  J  :  IRn  — >  IRnxn  and  F  :  IR"  — >  IR".  The  standard  assumptions  for 
Newton’s  method  applied  to  system  (49)  are: 

(51)  There  exists  x»  such  that  F(x*)  =  0. 

(52)  The  function  F  is  continuously  differentiable  in  an  open  convex  set  D 
containing  x»,  and  the  Jacobian  operator  J  is  Lipschitz  continuous  in 
D. 

(53)  The  Jacobian  matrix  J(x*)  is  nonsingular. 

Lemma  7.1  (Perturbation  Lemma) 

Assume  the  standard  assumptions  (S1)-(S3).  Assume  further  that  in  a  neigh¬ 
borhood  D  C  D  of  the  local  solution  x* 

(i)  ||J(x)||  =  0(\\x  -  x,||), 

(ii)  ||F1(x)j|  =  0(\\x  —  x«||2). 
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Then,  the  iterative  procedure  (51)  is  locally  and  q-quadratically  convergent  to 
xm. 

Proof.  From  assumption  (i)  and  Theorem  3.6  of  Stewart  [8]  we  have 


x+  -  x.  =  x  -  x,  -  ( J~l{x)  +  W(x)J~1(x))(F(x)  +  F(x)) 


where 


\\W(x)\\  < 


l-||J-Har)J(x)||’ 


for  all  i  in  a  neighborhood  of  x ,.  Hence, 


||*+-*.||  <  ||  J-^xXFX*,)  -  F(x)  -  J(x))(x,  -  x)\\ 

+  ||  J-1(x)||||F(x)||  +  ||W(x)||||  J-1(x)||||F(x)  -  F(x.) || 

+  ll^(x)||||J-1(x)||||F(x)||.  (52) 

By  the  standard  assumptions  (S1)-(S3)  and  assumption  (i)  we  have  that  there 
exists  a  neighborhood  of  x»,  such  that  for  all  x  in  this  neighborhood 

||W(x)||  =  0(||x  —  x»||).  (53) 

Moreover,  from  (52),  (53)  and  assumptions  (i),  (ii)  we  have  that  for  all  x  in 
a  neighborhood  of  x. 


||x+  —  x,||  =  0(||x-x,||2).  (54) 

It  follows  from  (54)  that  the  iterative  procedure  (51)  is  locally  and  quadrat- 
ically  convergent  to  x*.  □ 

In  order  to  utilize  the  perturbation  lemma  we  consider  Newton’s  method 
on  the  nonlinear  system  of  equations 

VxL(x,K,p)  -  0,  (55) 

where  A*  is  the  multiplier  associated  with  the  solution  x*  of  problem  (1) 
and  p  is  a  positive  constant  such  that  the  matrix  V|£-(x*,  A*,  p)  is  positive 
definite.  It  is  well  known  that  such  constants  exist  under  the  assumptions 
(Al)-(A3).  Hence,  our  ideal  algorithm  will  be 

x+  —  x  V2xL(x,  A,,  p)~v VxZ(x,  A„,  p).  (56) 
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Recall  that  the  iterative  procedure  (56)  is  locally  and  q-quadratically  con¬ 
vergent  to  x ». 

The  SQP-Newton  method  with  our  choice  of  the  Lagrange  multiplier 
formula  (10)  can  be  written 

*+  =  x  -  R(x)-‘  {V f(z)  +  Vh(x)Mx)(h(i)  -  Vh(x)T R(i)-'V f(x))) 

=  x-(V’xL(x,A„p)  +  Q(x))-1(VxL(x,A.,p)  +  E(x))  (57) 


where 

R(x)  =  V2xL(x,K,p)  +  Q(x) 

Q(x)  =  V2J(x,  U(x))  -  V2J(x,  A,)  +  Ch(x)V2h(x)  -  ph(x)V2h(x) 
A(x)  =  (V/i(x)Ti2(x)-1V/i(x))_1 

E{x)  =  Vh(x){A(x)(h(x)-'Vh(x)TR(x)-1Vf(x))-K-ph(x)}. 

Hence,  the  iterative  procedure  (57)  has  been  written  as  a  perturbation  of 
the  ideal  Newton  method  given  by  (56),  where  the  perturbations  are  given 
by  Q(x)  and  E(x). 

Our  next  result  demonstrates  that  the  SQP-Newton  method  with  La¬ 
grange  multiplier  estimate  given  by  formula  (10)  satisfies  conditions  (i)  and 
(ii)  of  the  previous  lemma,  whenever  the  approximation  formula  U  in  the 
Lagrange  multiplier  (10)  satisfies  a  mild  condition. 

Lemma  7.2  Let  x*  be  a  local  solution  of  problem  (1)  with  associated  multi¬ 
plier  A*.  Assume  that  in  a  neighborhood  of  the  solution  x* 

||f/(x)  -  A. ||  =  0(||z  -  *.||).  (58) 

Also  assume  the  standard  conditions  (A1)-(A3)  and  the  boundedness  condi¬ 
tion  (A4)-  Then  there  exists  a  neighborhood  N(x+),  of  the  local  solution  x*, 
such  that  for  all  x  €  iV(x*) 

(a)  ||<3(x)||  =  0(||x-x*||), 

(b)  !|£(z)||  =  0(||x-x.j|2). 
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Proof.  From  the  boundedness  condition  (A4),  the  standard  assumption 
(Al)  and  the  condition  (58)  it  follows  that  there  exists  a  neighborhood  of 
the  solution  x*,  such  that  for  all  x  in  that  neighborhood 

IIOWII  <  ||V^(z,  U(x))  -  V\l(x,  A.)||  +  Mil k(x)  -  Mx.)||||V2Mx)|| 
+MIIM*)  -  M*.)||||V2M*)|| 

=  o(||* -*.||)-'  (*») 

Therefore  condition  (a)  holds.  For  the  sake  of  simplicity  we  will  denote  the 
ith  component  of  the  gradient  vector  of  h,  evaluated  at  x,  (V/i,(x))  by  V/»,-. 
Similar  notation  will  be  used  for  similar  quantities.  Using  this  notation  we 
have 


A {h  -  V/^iT1  V/}  -  A .-ph  = 

=  A{P(h-h(x,))-VhTR-l(Vf  +  VhK) 

+  V  hT  R~x  (V  /  (x« )  +  VA(x*)A,)} 

=  a|p  J  V/it(x,  +  t(x  —  x,))(x  —  x+)dt 

—  VhTR~x  /  V2/(x*  +  t{x  —  x*))(x  —  x+)dt 
Jo 

-VhTRTx  u  V2hi(x*  +  t(x  -  x»))(x  -  x.)(A,),<ft  j 

where 

P  =  AreXm  pA 

Adding  and  subtracting  some  terms  conveniently  we  obtain 
A {h  -  VhTRTlVf}  -  A,  -  phk  = 

A  jp  £  Ax(t)(x  -  x*)dt  -  VhTR~x  £  A2(t)(x  -  x,)dt 

-VhTRTx  jr  £  A?(t)(A*),(x  -  x,)dt| 

+AZ(x  -  x.)  (60) 

where 

Ax(t)  =  (V/i(x*  +  t(x  —  x*))  —  V/i(x»))T, 

A2(t)  =  V2/(x*  +  t(x  -  x,))  -  V2/(x„), 

A} (t)  =  V2ft,(x«  +  t(x  -  x.))  -  V2/i,-(x„), 
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and 


Z  =  PVh(x.)T  -  V/iTi?-1V2/(x,)  -  VhTR-'  Y.  V2^(x,)(A*),-.  (61) 

i=i 

By  assumption  (Al)  and  the  fact  that  0  <  t  <  1  we  have  that  in  a  neighbor¬ 
hood  of  the  solution  x* 

l|A*(<)ll  =  0(||*-*.||), 

M2(i)||  =  0(||* -*.||), 

IK’MII  =  0(||*  -*.||).  (62) 

On  the  other  hand,  adding  and  subtracting  terms  to  (61)  we  have 

m 

Z  =  V/iTi?-1{(V2/  —  V2/(x»))  +  X](V2/i,-  —  V2/i,(x„))(A„)f- 

1=1 

m 

+  pVfc(Vfc  -  V/i(x,))T  +  2  V%(Ui  -  (A*),)} 

*=i 

+  (VA(*.)  -  V/i)t{jR“1V2/  +  pR-1VhVhr 

m  m 

+  ^_1  £  W  +  V^(x»)Ti2_1  £  CiV2hi(hi  -  hi{xm)).  (63) 

«=i  »=i 

By  assumptions  (A1)-(A4)  and  condition  (58)  we  have  that  in  a  neighborhood 
of  the  solution  x, 

||Z||  =  0(||* -*.||)  (64) 

From  (60),  (62)  and  (64)  it  follows  that  there  exists  a  neighborhood  of  the 
solution  x*  such  that 

\\E(x)\\  <  ||Vh||!|A{/i  —  VhTf?_1V/}  —  A*  —  ph\\ 

=  0(\\x  —  x*||2).  □ 

Theorem  7.1  Let  x*  be  a  local  solution  of  problem  (1)  with  associated  La¬ 
grange  multiplier  A».  Assume  the  standard  conditions  (Al)-(AS)  and  the 
boundedness  condition  (A4)-  Also  assume  that  in  a  neighborhood  of  the  so¬ 
lution  x* 

||tf(*)-A,||  =  0(||*-*.||).  (65) 

Then  the  SQP-Newton  method  with  the  choice  of  Lagrange  multiplier  given 
by  (10)  is  locally  and  q-quadratically  convergent  to  x*. 
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Proof.  It  is  not  difficult  to  prove  and  well  known  that  the  standard 
assumptions  (A1)-(A3)  imply  the  standard  assumptions  (S1)-(S3)  for  the 
nonlinear  system  of  equations  (55).  On  the  other  hand,  by  Lemma  7.2  con¬ 
ditions  (ii)  and  (iii)  of  Lemma  7.1  are  satisfied.  Therefore,  we  have  the 
assumptions  of  Lemma  7.1  and  thus  the  iterative  procedure  (57)  is  locally 
and  q-quadratically  convergent  to  a:,.  E 

Corollary  7.1  Let  x »  be  a  local  solution  of  problem  (1)  with  associated 
Lagrange  multiplier  A„.  Assume  the  standard  assumptions  (A1)-(A3).  The 
SQP-Newton  method  with  Lagrange  multiplier  estimate  given  by  (10),  where 
the  approximation  formula  U  is  the  least-squares  formula 

U(x)  =  — (V/i(x)TV/i(x))-1  V/i(x)TV/(x),  (66) 

or  the  Miele-Cragg-Levy  formula 

U{x)  =  (S/h(x)TVh(a ;))-1(fe(x)  -  V%)TV/(a :)),  (67) 

with  the  penalty  parameter  c  chosen  accordingly  to  Algorithm  6.1  or  Algo¬ 
rithm  6.2,  is  locally  and  q-quadratically  convergent  to  x*. 

Proof.  The  penalty  vectors  c  obtained  by  Algorithm  6.1  or  Algorithm  6.2 
satisfy  the  boundedness  condition  (A4).  It  is  straightforward  to  prove  that 
under  the  standard  assumptions  (A1)-(A3),  there  exists  a  neighborhood  of 
the  local  solution  x«  such  that 

\\U(x)  -  A* ||  =  0(||x  -  x.||),  (68) 

where  the  approximation  formula  U  is  given  by  formula  (66)  or  by  formula 
(67).  □ 

A  rather  direct  extension  of  these  results  shows  that  the  SQP-Newton 
method  with  Lagrange  multiplier  formula  given  by  (10),  where  formula  U 
is  given  by  the  QP  multiplier  formula  (the  multiplier  associated  with  the 
solution  of  subproblem  (5)  )  is  locally  and  q-quadratically  convergent  in  the 
pair  (x,A);  but  not  necessarily  in  x  alone. 

8  Numerical  Results 

In  this  section  we  discuss  some  issues  concerning  the  implementation  of  the 
SQP-Newton  method  and  present  numerical  results  obtained  from  our  im- 
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plementation  of  the  method. 

The  SQP-Newton  method  that  we  implemented  to  test  our  multiplier 
estimate  is 

Xk+ X  =xk  +  Axk  (69) 

where  Axk  is  the  solution  of  the  linear  system 

Vl£(xk,U(xk)  +  Ckh(xk))  Vh(xk)  \(  Ax  (  -VJ(xkMxk))  \ 

Vh(xk)T  0  )\AX  )  \  -h(xk)  )  ' 

(70) 

The  penalty  vector  Ck  =  diag(cjt)  is  computed  using  Algorithm  6.1  or 
Algorithm  6.2.  We  also  imposed  the  condition 

IM>  <  AT,  (71) 

for  some  positive  constant  M.  We  proceed  in  the  following  way:  whenever 
Cfc  satisfies  (71),  it  is  acceptable.  Otherwise,  we  set  Ck  =  0  ,  and  perform  the 
modified  Cholesky  factorization,  given  in  Dennis  and  Schnabel  [2],  on  the 
matrix  H( 0)  to  obtain  pk. 

The  problems  tested  were  taken  from  Hock  and  Schittkowski  [5]  and  will 
be  referenced  by  the  number  given  there.  The  SQP-Newton  method  with 
the  choice  of  the  multiplier  (10)  was  implemented  in  Matlab  4.0  on  a  Sparc 
station  1.  The  choices  for  U  in  formula  (10)  are: 

ULs(x)  =  —('Vh(x)T'Vh(x))~1'Vh(x)TVf(x)  (72) 

Umcl(x)  =  (V ^(x)TV h(x))~1(h(x)  -  Vh(x)TVf(x))  (73) 

UqP(x,  X)  =  (Vh(x)THWh(x))-1{h{x)-Vh(x)THVf(x))  (74) 

where  the  matrix  H  =  V%£(x,  A)-1. 

In  order  to  study  the  robustness  of  the  Lagrange  multiplier  choice  (10) 
in  the  SQP-Newton  framework  we  tested  each  problem  starting  from  various 
initial  points  xo.  Moreover,  in  order  to  make  uniform  comparisons,  in  all  the 
experiments  we  use  the  same  initial  Lagrange  multiplier 

A°  =  ULS(x  0).  (75) 

The  numerical  results  are  reported  in  Tables  1  through  9.  The  numbers 
in  the  column  labeled  PN  give  the  number  of  the  problem  being  tested. 
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The  numbers  in  the  column  labeled  NIP  give  the  number  of  starting  points 
tested  for  each  particular  problem.  We  choose  the  initial  iterate  that  ap¬ 
pears  in  Hock  and  Schittkowski  [5]  and  several  other  initial  iterates  that 
were  presented  in  Williamson  [13].  The  numbers  in  the  column  under  the  la¬ 
bel  \(xk)  =  U(x k)  +  Ckh(xk),  give  the  number  of  different  starting  points  for 
which  the  algorithm  converged.  For  example,  in  the  column  labeled  c  =  c^, 
p  =  0,  fi  =  500  and  Seq.  appears  the  number  of  starting  points  for  which 
the  SQP-Newton  method,  given  by  (69)  with  p  —  0  and  p  =  500  in  Algo¬ 
rithm  6.2,  the  sequential  version  of  Algorithm  4.5,  converged.  In  this  case 
p  —  0  means  that  we  did  not  add  any  diagonal  matrix  to  the  matrix  H( 0) 
even  when  the  matrix  H(ck)  was  not  positive  definite.  Moreover,  p  =  pk 
means  that  we  added  to  the  diagonal  of  H( 0)  the  matrix  p^I.  The  numbers 
in  the  column  labeled  j  give  the  number  of  iterations  required  to  achieve 
convergence  over  the  number  of  times  a  nonzero  for  p  was  computed. 

In  Tables  1,  2  and  3  the  stopping  criteria  employed  was  either 

<  nr7, 

or  the  number  of  iterations  reached  250.  In  these  tables  the  results  presented 
were  obtained  allowing  the  reduced  Hessian  matrix  H{ck )  to  be  indefinite. 
This  means  that  when  the  reduced  Hessian  matrix  H(ck)  is  not  positive 
definite  and  problem  (5)  may  therefore  not  have  a  solution,  we  obtain  the 
iterate  instead  by  solving  the  extended  system  (70).  In  Tables  4,  5  and  6  we 
did  not  allow  the  reduced  Hessian  matrix  H(ck )  to  be  indefinite.  Therefore, 
we  not  only  used  the  previous  stopping  criteria,  we  also  consider  that  the 
algorithm  failed  if  the  reduced  Hessian  matrix  H(ck)  given  in  (42)  had  an 
eigenvalue  less  than  10-6.  In  Tables  7,  8  and  9  we  did  not  allow  the  reduced 
Hessian  matrix  H(ck)  to  be  indefinite.  However,  in  this  case,  the  stopping 
criteria  was  the  same  as  in  Tables  1,  2  and  3.  For  these  tables,  we  followed 
Algorithm  6.1  or  Algorithm  6.2,  i.e.,  we  added  pkl  to  the  reduced  Hessian 
matrix  H( 0)  when  H(ck )  was  not  positive  definite. 

We  observe  from  Tables  1,  2,  3,  4,  5  and  6  that  we  can  achieve  convergence 
to  a  minimizer  x„,  in  many  problems  from  different  starting  points,  just  by 
computing  the  penalty  vector  c*  proposed  in  this  work.  Also  Tables  4,  5  and 
6  indicate  that  this  new  choice  of  the  multiplier  generates  a  positive  definite 
reduced  Hessian  matrix  more  frequently  than  the  traditional  multiplier  for¬ 
mulas  (least-squares  multiplier  (72),  Miele-Cragg-Levy  multiplier  (73)  and 
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PN 

NIP 

Xk  = 

ULs(*k)  + 

Ckhfak) 

c  =  0 

P  =  o 

c  =  ck 
P=  o 
(i  =  500 
Parallel 

c  =  ck 

P=  o 

VL-  500 
Seq. 

c  =  Cfc 

P  =  o 

H  =  1000 

Parallel 

c  =  c* 

P  =  o 

M  =  1000 

Seq. 

6 

9 

2 

8 

8 

8 

8 

7 

4 

0 

4 

4 

4 

4 

26 

5 

2 

5 

5 

5 

5 

27 

7 

5 

7 

7 

7 

7 

60 

4 

1 

4 

4 

4 

4 

39 

11 

0 

9 

7 

9 

7 

40 

12 

4 

11 

7 

7 

8 

42 

12 

4 

9 

10 

9 

10 

77 

10 

7 

10 

10 

10 

10 

78 

12 

6 

12 

10 

10 

10 

79 

10 

6 

7 

8 

7 

8 

46 

8 

7 

8 

8 

7 

6 

47 

11 

5 

9 

10 

10 

11 

56 

9 

0 

0 

7 

1 

7 

Table  1:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges  (indefinite  reduced  Hessian  allowed). 

the  QP  multiplier  (74  )).  Moreover,  from  Tables  7,  8  and  9  we  observe  that 
the  number  of  times  we  compute  pk  compared  with  the  number  of  times  we 
compute  pk  for  the  traditional  multipliers  is  smaller.  Finally,  we  can  achieve 
convergence  to  a  minimizer  in  almost  all  the  problems  from  most  starting 
points. 

It  is  important  to  mention  that  the  algorithms  do  not  always  converge 
to  the  same  points.  Our  numerical  experiments  seem  to  indicate  that  the 
region  of  local  convergence  is  larger  with  our  choice  of  the  multipliers.  On 
the  other  hand,  our  numerical  results  seems  to  indicate  that  Algorithm  6.1 
and  Algorithm  6.2  are  not  very  sensitive  to  the  choice  of  the  parameter  p. 

So  far,  we  have  considered  the  local  properties  of  the  new  choice  of  mul¬ 
tipliers  (10).  In  the  near  future  we  would  like  to  embed  this  technique  in  a 
globalization  strategy;  and  consider  the  global  properties  of  this  new  choice 
of  the  multipliers. 
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Table  2:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  allowed). 


Table  3:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  allowed). 
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PN 

NIP 

Afc  = 

umnsa 

H 

c  —  c, 

p  =  0 

H  =  500 
Parallel 

c  =  c* 

P  =  o 

M  =  1000 

Parallel 

c  =  c. 

P  =  o 

/i  =  1000 

Seq. 

6 

9 

2 

7 

7 

7 

7 

7 

4 

0 

4 

4 

4 

4 

26 

5 

4 

4 

4 

4 

27 

mm 

7 

7 

7 

7 

4 

4 

4 

4 

39 

mm 

o 

5 

6 

8 

7 

40 

■  m 

3 

2 

3 

2 

42 

B  9 

10 

10 

77 

mm 

3 

3 

78 

7 

7 

79 

■  M 

5 

6 

46 

0 

i 

0 

0 

47 

2 

10 

8 

2 

4 

56 

9 

0 

0 

0 

0 

0 

Table  4:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 


PJV 

1V/P 

o  o 

II  II 

c  =  c« 

P  =  0 

H  =  500 
Parallel 

c  =  c* 

p  =  o 

M  =  1000 

Parallel 

c  =  c, 

P  =  o 

/i  =  500 
Seq. 

c  =  c« 

P  =  o 

At  =  1000 

Seq. 

6 

9 

3 

7 

6 

7 

6 

7 

4 

0 

4 

4 

4 

4 

26 

5 

2 

4 

4 

0 

27 

5 

7 

7 

7 

60 

4 

4 

4 

39 

6 

6 

7 

6 

2 

2 

2 

3 

9 

9 

10 

10 

3 

3 

3 

2 

6 

7 

7 

7 

7 

5 

4 

4 

5 

5 

8 

0 

0 

0 

0 

0 

11 

2 

4 

4 

5 

4 

9 

0 

0 

0 

0 

0 

Table  5:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 
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PN 

NIP 

■<5  o 

ll  ll 
o  o 

c  =  c» 

P  =  o 

=  500 
Parallel 

c  =  c» 
p  =  o 

M  =  1000 

Parallel 

c  =  c* 
p  =  0 
/i  =  500 
Seq. 

c  =  c, 

P  =  0 
(1  =  1000 
Seq. 

6 

9 

6 

8 

8 

8 

8 

7 

4 

0 

4 

26 

5 

2 

5 

27 

4 

1 

5 

60 

4 

1 

4 

39 

11 

0 

6 

6 

6 

40 

12 

0 

1 

1 

1 

1 

42 

12 

4 

9 

9 

10 

10 

77 

10 

3 

3 

3 

3 

3 

78 

12 

4 

5 

5 

4 

4 

79 

10 

4 

4 

4 

5 

5 

46 

8 

0 

0 

0 

0 

0 

47 

11 

2 

5 

5 

5 

5 

56 

9 

0 

0 

0 

0 

0 

Table  6:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 
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Table  7:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 
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Table  8:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 
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A  t  =  UQp(xk,*k)  +  Ckh(xic) 


PN 

NIP 

ft 

II  II 
® 

it 

P 

c  =  Ck 

P-  Pk 
n  =  500 
Parallel 

it 

P 

C-Ck 

P-Pk 

H  =  1000 

Parallel 

it 

P 

it 

P 

c  =  Ck 

P  =  Pk 

fi  =  1000 

Seq. 

it 

P 

n 

9 

50 

22 

9 

282 

2 

9 

189 

2 

9 

282 

2 

9 

189 

2 

ill 

4 

496 

64 

4 

51 

0 

4 

52 

0 

4 

52 

0 

4 

52 

0 

26 

5 

116 

16 

5 

5 

191 

0 

5 

187 

0 

5 

191 

0 

27 

6 

81 

5 

6 

139 

1 

6 

139 

1 

6 

139 

1 

6 

139 

1 

60 

4 

60 

7 

4 

59 

0 

4 

59 

0 

4 

59 

0 

4 

E3 

39 

11 

219 

$0 

258 

65 

244 

57 

9 

170 

42 

9 

167 

28 

40 

12 

9 

145 

88 

9 

144 

31 

9 

121 

42 

9 

146 

23 

6 

80 

13 

42 

12 

9 

189 

68 

9 

199 

65 

7 

115 

25 

6 

184 

72 

8 

184 

55 

77 

10 

8 

322 

140 

8 

304 

93 

8 

216 

66 

9 

431 

236 

9 

253 

85 

78 

12 

9 

294 

215 

140 

53 

184 

51 

11 

187 

38 

11 

258 

62 

79 

10 

10 

145 

15 

Kv3 

153 

22 

123 

14 

10 

118 

9 

46 

8 

8 

224 

33 

8 

288 

66 

8 

316 

75 

8 

226 

39 

8 

227 

40 

47 

11 

11 

211 

26 

11 

220 

19 

11 

204 

12 

11 

218 

18 

11 

236 

21 

56 

9 

0 

0 

0 

2 

109 

85 

4 

515 

396 

3 

26 

7 

0 

0 

0 

Table  9:  Number  of  starting  points  for  which  the  SQP-Newton  method  con¬ 
verges,  (indefinite  reduced  Hessian  not  allowed). 
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